Why Every AI PM Needs to Run Evals | Aman Khan, Arize AI Head of Product (Ex. Spotify, Apple, Cruise)

Update: 2025-08-05

Description

In this episode of Future Proof, we sit down with Aman Khan, the Head of Product at Arize AI. Aman reveals why traditional product metrics fail for AI systems and shares Arize's framework for building evaluation systems that actually predict real-world AI performance, plus the emerging PM skills that separate successful AI products from failed experiments.

We discuss:

How AI builders should think about evaluations

The role of the AI PM and how product management is evolving

How to build when you expect foundation models to keep changing

(0:00) Highlights

(0:37) Intro

(1:40) What is an AI PM?

(4:10) How PMs are evolving with AI

(8:10) The aha moment in AI

(11:50) How AI builders should think about evaluations

(19:40) How AI builders can best use their time on AI evaluations

(23:40) Prompt iteration: if your evaluations are not ideal, how do you iterate?

(27:40) What's the minimum viable eval someone should write?

(30:40) How prioritization would change based on the future of AI models

(36:40) Final thoughts

(38:00) Ethan's reflection

Ship integrations 7x faster https://www.useparagon.com/

Watch all Future Proof episodes: https://www.useparagon.com/future-proof
